Rank | Count | Beginning |
---|---|---|
200 | 7773 | A |
9090 | 2672 | Az |
15958 | 422 | Ez |
24461 | 419 | Nem |
17814 | 374 | Ha |
13970 | 317 | Egy |
13329 | 249 | De |
15648 | 162 | Es |
16595 | 159 | Ezt |
18953 | 155 | Hozzáadom |
15218 | 154 | Ennek |
19417 | 146 | Így |
22413 | 129 | Még |
11675 | 113 | Azt |
16847 | 110 | Ezzel |
23478 | 110 | Mint |
28722 | 100 | Úgy |
12946 | 99 | Cikk |
23839 | 96 | Most |
18744 | 92 | Hogy |
22106 | 87 | Már |
5373 | 85 | Amikor |
12237 | 82 | Bár |
16183 | 81 | Ezek |
24126 | 81 | Nagyon |
16359 | 80 | Ezért |
19815 | 78 | Itt |
15138 | 77 | Én |
23704 | 74 | Mivel |
13085 | 73 | Csak |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
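For readers without database access, the same count can be sketched in Python. This is a minimal illustration, not the tooling used to produce the table: the function name `sentence_beginnings`, its parameters, and the toy sentences are assumptions introduced here. Like `substring_index(sentence, ' ', N)`, it keeps everything before the N-th space, and returns the whole sentence if it has fewer than N spaces.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper: mirrors the SQL query's grouping on
    substring_index(sentence, ' ', n) followed by ORDER BY cnt DESC LIMIT.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces and keep the first n pieces, which matches
        # "everything before the n-th space" in the SQL version.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    return counts.most_common(limit)


# Toy input for illustration only; these sentences are not from the corpus.
sample = [
    "A kutya ugat.",
    "A macska alszik.",
    "Az alma piros.",
    "A kutya fut.",
]

print(sentence_beginnings(sample, n=1))
print(sentence_beginnings(sample, n=2))
```

Changing `n` to 2, 3 or 4 gives the longer beginnings covered in the following subsections. Like the query, the sketch splits on spaces only, so punctuation attached to a word stays part of the beginning.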